A regular expression, often called a pattern in Perl, is a template that either matches or doesn’t match a given string.
- Metacharacters
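o When referencing groups, remember that $1, $2, ... hold the values from the most recent successful match, and that Perl reports the leftmost match in the string, not the last one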
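    .    # match any single character except newline
    \    # use the literal value of the next character
    *    # quantifier: match zero or more times
    ()   # used for grouping
    +    # quantifier: match one or more times
    |    # or (alternation)
    ^    # negation ("anything except") when it comes first inside [ ]; outside brackets it anchors the match to the start of the string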
    $_ = "yabba dabba doo baba foo fofo foooo!";
    if ( /(.)[^ba]/ ) { print "$1 \n"; }   # a - matched "a " (the last "a" of "yabba" plus the space), which is the leftmost match
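One way to check which part of the string a pattern really matched is to print Perl's built-in `$&` variable (the whole matched text) next to `$1`. A minimal sketch, assuming the same test string as above; the variable names `$text`, `$whole`, `$group`, and `$alt` are only illustrative:

```perl
use strict;
use warnings;

my $text = "yabba dabba doo baba foo fofo foooo!";
my ($whole, $group, $alt) = ('', '', '');

# Perl reports the leftmost match: the first position where the pattern succeeds.
if ($text =~ /(.)[^ba]/) {
    $whole = $&;   # the entire matched text
    $group = $1;   # the text captured by the first pair of parentheses
    print "matched '$whole', \$1 is '$group'\n";   # matched 'a ', $1 is 'a'
}

# With alternation, the branch that matches earliest in the string wins,
# regardless of the order in which the branches are written.
if ($text =~ /foo|doo/) {
    $alt = $&;
    print "alternation matched '$alt'\n";          # alternation matched 'doo'
}
```

Printing `$&` this way makes it easy to confirm that the match came from the end of "yabba" and not from "baba" later in the string.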
- Quantifiers
o * The star (*) means to match the preceding item zero or more times
o + The plus (+) means to match the preceding item one or more times
o ? The question mark (?) makes the preceding item optional: it matches zero times or one time
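The three quantifiers can be compared side by side on a small word list. This is only a sketch: the word list and the array names are made up for illustration, and the `^` and `$` anchors are added here so that each pattern has to cover the whole word.

```perl
use strict;
use warnings;

my @words = ("fd", "fod", "food", "foood");
my (@star, @plus, @optional);

for my $w (@words) {
    push @star,     $w if $w =~ /^fo*d$/;   # zero or more "o"
    push @plus,     $w if $w =~ /^fo+d$/;   # one or more "o"
    push @optional, $w if $w =~ /^fo?d$/;   # zero or one "o"
}

print "*: @star\n";       # *: fd fod food foood
print "+: @plus\n";       # +: fod food foood
print "?: @optional\n";   # ?: fd fod
```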
- Grouping
o Parentheses are used for grouping
    $_ = "yabba dabba doo";
    if (/y((.)(.)\3\2) d\1/) { print "It Matched!\n"; }   # \1 = abba, \2 = a, \3 = b
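Nested groups are numbered by counting opening parentheses from left to right. A short sketch that prints each group for the pattern `y((.)(.)\3\2) d\1`; the names `$text` and `@groups` are illustrative:

```perl
use strict;
use warnings;

my $text = "yabba dabba doo";
my @groups;

# Group 1 is the outer pair of parentheses; groups 2 and 3 are nested inside it.
if ($text =~ /y((.)(.)\3\2) d\1/) {
    @groups = ($1, $2, $3);
    print "\$1 = '$1', \$2 = '$2', \$3 = '$3'\n";   # $1 = 'abba', $2 = 'a', $3 = 'b'
}
```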
- Character Classes
o A list of possible characters inside square brackets ( [ ] )
o Example ( [abcdef] ) means to match any one of those characters
o ( [a-f] ) is another way to write the example above
o There are character class shortcuts and negation shortcuts, such as:
    [0-9]          # \d
    [a-zA-Z0-9_]   # \w (letters, digits, and underscore)
    \s             # whitespace
    [^\d]          # \D
    [^\w]          # \W
    [^\s]          # \S
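The classes and shortcuts above can be tried out by collecting every match in a string. A minimal sketch, assuming a made-up sample line; it relies on the `/g` modifier in list context, which returns all matches instead of only the first:

```perl
use strict;
use warnings;

my $line = "Order 66 shipped to bay_7";

my @digits   = $line =~ /\d+/g;     # runs of digits
my @words    = $line =~ /\w+/g;     # runs of letters, digits, or underscore
my @a_to_f   = $line =~ /[a-f]/g;   # single characters in the range a-f
my @non_word = $line =~ /\W/g;      # anything that is not a word character

print "digits: @digits\n";                  # digits: 66 7
print "words:  @words\n";                   # words:  Order 66 shipped to bay_7
print "a-f:    ", join('', @a_to_f), "\n";  # a-f:    deedba
print "non-word count: ", scalar(@non_word), "\n";   # non-word count: 4
```

Note that `bay_7` counts as one word, because `\w` includes both the underscore and digits.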
Example Code for Regex:
    #!/usr/bin/perl
    # REGEX
    print "\nStart Program\n";

    # Example Code:
    $_ = "yabba dabba doo baba foo fofo foooo!";   # set the default input
    if ( /(.)\1/ )       { print "$1 \n"; }      # b - matched: bb
    if ( /y(....) d\1/ ) { print "$1 \n"; }      # abba - matched: yabba dabba
    if ( /y(.)(.)\2\1/ ) { print "$2 $1 \n"; }   # b a - matched: yabba
    # The next line is commented out because it stops the whole script from compiling:
    # if ( /y(.)(.)\2\3/ ) { print "$2 $1 \n"; } # error!
    # error: /y(.)(.)\2\3/: reference to nonexistent group at ch7_regex.pl line 14.
    if ( /(.)\111/ )     { print "$1 \n"; }      # no match - \111 is read as an octal escape (the letter I), not as \1 followed by 11
    if ( /(.)\1\1\1/ )   { print "$1 \n"; }      # o - matched: oooo (in foooo)
    if ( /(.)[^bar]/ )   { print "$1 \n"; }      # a - matched "a " at the end of yabba (the leftmost match)
    if ( /(.c)|(.f)/ )   { print "$1 $2 \n"; }   # _f - $1 is empty, $2 is " f" from " foo" (the first f)

    # Exercise 1
    $_ = "Fred Flintsone Harvey Dent Gothem Alfred Wayne Freddy. Kruger\n";
    # None of these patterns has a capture group, so $1 prints as empty.
    if (/fred/)       { print "1 $1 \n"; }   # 1 - matches the "fred" in Alfred (matching is case-sensitive)
    if (/jack/)       { print "2 $1 \n"; }   # _ - no match
    if (/[A|a]lfred/) { print "3 $1 \n"; }   # 3 - matches Alfred; the | inside [ ] is a literal character, so [Aa] is enough
    if (/alfred/)     { print "4 $1 \n"; }   # _ - no match
    if (/\.+/)        { print "5 $1 \n"; }   # 5 - matches the "." after Freddy
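The `\111` case can be made unambiguous with the `\g{N}` form of a back reference, which is available in Perl 5.10 and later. A sketch under that assumption; the sample strings and variable names are invented for illustration:

```perl
use strict;
use warnings;

my $text = "abc xx11 def";

# \111 is parsed as an octal escape (character code 73, the letter "I"),
# so this looks for any character followed by "I" and finds nothing here.
my $octal_hit = ($text =~ /(.)\111/) ? $& : '';

# The same pattern does match when an "I" is present.
my $octal_on_hi = ("HI" =~ /(.)\111/) ? $& : '';

# \g{1} always means "group 1", so the digits after it stay literal.
my $backref_hit = ($text =~ /(.)\g{1}11/) ? $& : '';

print "octal form on text: '$octal_hit'\n";     # octal form on text: ''
print "octal form on HI:   '$octal_on_hi'\n";   # octal form on HI:   'HI'
print "\\g{1}11 on text:    '$backref_hit'\n";  # \g{1}11 on text:    'xx11'
```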